1 Introduction



2 Methodology


Data preparation

Searches with a volume of 0 are removed from the data set. For the Ahrefs analysis, only searches with a volume above 100 are considered. Analyses are based on the US location.
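The two filters above can be sketched as follows. This is a minimal illustration, not the pipeline actually used: the row layout, the example values, and the function names `prepare` and `enrichment_subset` are assumptions made for the example.

```python
# Hypothetical rows: (keyword, search volume). Values are illustrative only.
rows = [
    ("youtube", 185_000_000),
    ("rare phrase", 0),
    ("spectrum internet", 998_000),
    ("niche query", 40),
]

def prepare(rows):
    """Drop searches whose volume is exactly 0."""
    return [(kw, vol) for kw, vol in rows if vol != 0]

def enrichment_subset(rows, min_volume=100):
    """Keep only searches with volume strictly above min_volume."""
    return [(kw, vol) for kw, vol in rows if vol > min_volume]

cleaned = prepare(rows)
enrichable = enrichment_subset(cleaned)
print(len(cleaned), [kw for kw, _ in enrichable])
```

Keeping the two filters as separate functions mirrors the text: the zero-volume filter applies to every analysis, while the volume threshold applies only to the enriched subset.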



Data enrichment

About 2.5 million searches were enriched with Ahrefs data. The added statistics are difficulty, return rate, clicks, regional volume, and SERP features.



Overview of the data

Overview
Statistic Value
Total number of searches ~306 million
Total volume of searches ~303 billion
Searches with missing volume 0.51%
Mean search volume 989
Median search volume 10
Mean CPC 0.61
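As a quick consistency check on the table above, the reported mean should be close to total volume divided by the number of searches. The sketch below uses the rounded figures from the table, so a small deviation from the reported mean is expected.

```python
# Rounded figures taken from the overview table.
total_searches = 306e6   # ~306 million searches
total_volume = 303e9     # ~303 billion total search volume
reported_mean = 989

implied_mean = total_volume / total_searches
print(round(implied_mean))  # about 990, in line with the reported mean of 989
```

The gap between the mean (989) and the median (10) already indicates a heavily right-skewed distribution: a small number of very high-volume searches pull the mean far above the typical search.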



3 Research Findings


3.1 Top searches

The following table lists the top searches by share of total volume, with the misspelled variants of each site counted toward that site.

Searches with highest volume
Search Proportion
youtube 22.5%
facebook 5.6%
amazon 4.9%
google 4.0%
weather 1.5%
translate 1.0%
com 1.0%
instagram 0.9%
walmart 0.9%
ebay 0.8%
yahoo 0.7%
youtubecom 0.5%
you 0.5%
netflix 0.5%
news 0.4%
craigslist 0.4%
mail 0.3%
roblox 0.3%
gmail 0.3%
trump 0.2%
twitter 0.2%
map 0.2%
fox 0.2%
target 0.2%
123movies 0.2%
coronavirus 0.2%
nfl 0.2%
the youtube 0.2%
maps 0.2%
pinterest 0.2%
calculator 0.2%
ups 0.2%
espn 0.2%
classroom 0.1%
hotmail 0.1%
macy’s 0.1%
bitcoin 0.1%
linkedin 0.1%
nba 0.1%
msn 0.1%
usps 0.1%
food 0.1%
near 0.1%
tiktok 0.1%
login 0.1%
covid 0.1%
fox news 0.1%
tv 0.1%
games 0.1%
on 0.1%


Note that in the data set, many of the top searches are misspellings of popular websites. For example, the table below lists 10 of the highest-volume searches when variants are not grouped by intended spelling. Almost all of them are attempts to reach YouTube. For some of them the intended spelling was not recognized, so the percentages in the table above are underestimates.

Searches with highest volume, no grouping
keyword    location  spell     spell_type           keyword_info_search_volume
youtubes   2840      -         -                    1.85e+08
yotb       2840      youtube   did_you_mean         1.85e+08
yuotube    2840      youtube   showing_results_for  1.85e+08
fceb       2840      facebook  showing_results_for  1.85e+08
iyou tube  2840      -         -                    1.85e+08
youruvw    2840      youtube   showing_results_for  1.85e+08
ajoutool   2840      youtube   did_you_mean         1.85e+08
yourub     2840      youtube   showing_results_for  1.85e+08
youtout    2840      youtube   showing_results_for  1.85e+08
youi tue   2840      youtube   showing_results_for  1.85e+08
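Grouping variants under their intended spelling can be sketched as below. This is an illustration under assumptions: the tuple layout and the function name `group_by_intended` are hypothetical, and the volumes are placeholders rather than values from the data set. Keywords without a recognized spelling stay under their own name, which is why the grouped percentages underestimate the true share.

```python
from collections import defaultdict

# Hypothetical rows: (keyword, suggested spelling or None, volume).
rows = [
    ("yotb", "youtube", 100),
    ("yuotube", "youtube", 100),
    ("youtubes", None, 100),   # misspelling not recognized
    ("fceb", "facebook", 100),
]

def group_by_intended(rows):
    """Sum volume per intended spelling, falling back to the keyword itself."""
    totals = defaultdict(int)
    for keyword, spell, volume in rows:
        totals[spell or keyword] += volume
    return dict(totals)

grouped = group_by_intended(rows)
print(grouped)
```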



3.2 Search volume

The top 2000 searches have extremely high volume, while the vast majority of the rest of the searches are very low volume.

(Note that if misspelled variants are grouped together, this figure will have an even more extreme skew.)
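One way to quantify this skew is the share of total volume held by the top k searches. The sketch below is illustrative only: the function name `top_k_share` and the sample volumes are assumptions, not figures from the data set.

```python
def top_k_share(volumes, k):
    """Fraction of total volume contributed by the k highest-volume searches."""
    ordered = sorted(volumes, reverse=True)
    total = sum(ordered)
    return sum(ordered[:k]) / total if total else 0.0

# Illustrative volumes: two very large searches and a long tail of small ones.
volumes = [1000, 500, 10, 10, 10, 10, 10, 10, 10, 10]
share = top_k_share(volumes, 2)
print(f"{share:.1%}")
```

Applied to the real data with k = 2000, this would give a single number summarizing how concentrated the volume is.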



3.3 Spell types

If a misspelling is recognized, a so-called spell type is assigned. There are three spell types, shown in the table below. Only 1.4% of searches have a spell type, but those that do tend to have high volume.

3.4 Questions in searches

~14% of searches are in the form of a question. “how” is the most common question word.

However, questions account for only about 1% of the total search volume.
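The difference between the share of searches and the share of volume can be made concrete with a short sketch. The source does not state how questions were identified; the rule used here (the query starts with a question word) and the word list are assumptions, as are the sample rows.

```python
# Assumed question words; the source only confirms that "how" is the most common.
QUESTION_WORDS = {"how", "what", "why", "when", "where", "who", "which"}

def is_question(query):
    """Treat a query as a question if its first word is a question word."""
    words = query.lower().split()
    return bool(words) and words[0] in QUESTION_WORDS

def question_shares(rows):
    """Return (share of searches, share of volume) that are questions."""
    total_volume = sum(volume for _, volume in rows)
    question_rows = [(q, v) for q, v in rows if is_question(q)]
    count_share = len(question_rows) / len(rows)
    volume_share = sum(v for _, v in question_rows) / total_volume
    return count_share, volume_share

# Illustrative rows: (query, volume).
rows = [("how to tie a tie", 10), ("youtube", 1000),
        ("what is bitcoin", 20), ("weather", 500)]
count_share, volume_share = question_shares(rows)
print(count_share, round(volume_share, 3))
```

In this toy sample, questions are half of the searches but only a small fraction of the volume, the same pattern the report describes at a larger scale.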



3.5 Stopwords

“how” and “the” are the most common stopwords, each appearing in roughly 6-8% of searches.

A colorful version:



3.6 Keyword length

The most-searched queries are 6-9 characters long, and search volume falls steadily for queries that are longer or shorter than that.


Search phrases with fewer words have higher volume.
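The relationship between word count and volume can be computed along these lines. This is a sketch under assumptions: words are split on whitespace, the function name `mean_volume_by_word_count` is hypothetical, and the sample rows are illustrative.

```python
from collections import defaultdict

def mean_volume_by_word_count(rows):
    """Map number of words in the keyword to the mean volume of those keywords."""
    sums = defaultdict(lambda: [0, 0])  # word count -> [total volume, row count]
    for keyword, volume in rows:
        n_words = len(keyword.split())
        sums[n_words][0] += volume
        sums[n_words][1] += 1
    return {n: total / count for n, (total, count) in sorted(sums.items())}

# Illustrative rows: (keyword, volume).
rows = [("youtube", 1000), ("weather", 500),
        ("fox news", 100), ("chase bank near me", 20)]
by_words = mean_volume_by_word_count(rows)
print(by_words)
```

Replacing `len(keyword.split())` with `len(keyword)` gives the corresponding breakdown by character length.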



3.7 Keyword info categories

Internet & Telecom is the keyword category with the highest mean volume.

Arts & Entertainment, Internet & Telecom, and News, Media & Publications have the highest total volume.

Finance has the highest mean cost per click.

The mean CPC across all searches is 0.61.



3.8 Keyword difficulty

As volume increases, the difficulty increases.

The slope of the linear regression line is such that each doubling of the volume increases the difficulty by 1.63. For example, as the volume goes from 100 to 6400 (6 doublings), the difficulty increases by roughly 1.63 * 6 ≈ 10.
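The per-doubling arithmetic can be checked with a few lines. The number of doublings between two volumes is the base-2 logarithm of their ratio, so six doublings correspond to a 64-fold increase (for example, 100 to 6400). The function name `difficulty_increase` is hypothetical; only the slope of 1.63 comes from the regression described above.

```python
import math

SLOPE_PER_DOUBLING = 1.63  # difficulty points per doubling of volume

def difficulty_increase(volume_from, volume_to):
    """Expected change in difficulty when volume moves between two values."""
    doublings = math.log2(volume_to / volume_from)
    return SLOPE_PER_DOUBLING * doublings

print(round(difficulty_increase(100, 6400), 2))  # 6 doublings
print(round(difficulty_increase(100, 3200), 2))  # 5 doublings
```

This assumes the linear fit holds across the whole volume range, which the source does not confirm.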


Higher difficulty also means higher CPC on average. Note that the y-axis is logarithmic, so a small move along it corresponds to a large change in value.

An alternative visualization of the same data by grouped category in a boxplot:



3.9 SERP features

(Note there are (at least) two additional SERP feature types (Knowledge Panel and Videos), for which the sample size is too small to include.)

The SERP features that appear in the most searches are Image pack and People also ask:

The knowledge card greatly reduces the clicks per search, while the other SERP features have limited effect. Searches with the Shopping results SERP feature have higher clicks per search on average.

Easy keywords have fewer SERP features.

Thumbnail & Top stories is the most common SERP feature pairing.
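Counting feature pairings can be sketched as follows. The input layout (one list of SERP features per search), the function name `pair_counts`, and the sample data are assumptions for illustration and are not taken from the data set.

```python
from collections import Counter
from itertools import combinations

def pair_counts(feature_sets):
    """Count how often each unordered pair of SERP features co-occurs."""
    counts = Counter()
    for features in feature_sets:
        # Sort and de-duplicate so each pair is counted once per search.
        for pair in combinations(sorted(set(features)), 2):
            counts[pair] += 1
    return counts

# Illustrative searches, each with its list of SERP features.
searches = [
    ["Thumbnail", "Top stories"],
    ["Top stories", "Thumbnail", "Image pack"],
    ["Image pack", "People also ask"],
]
counts = pair_counts(searches)
print(counts.most_common(1))
```

Sorting the features before forming pairs ensures that ("Thumbnail", "Top stories") and ("Top stories", "Thumbnail") are counted as the same pairing.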

Searches without SERP features tend to be low volume.

Searches with more SERP features have higher mean difficulty.



3.10 Return rate

Searches with high return rates tend to have lower difficulty and receive far more clicks.

Comparison of searches with same volume but different return rates
return_rate mean_cpc mean_clicks mean_difficulty
very high 0.96 71423 18.4
low 0.70 15094 25.6



3.11 International searches

There is search data from 5 English-speaking countries.

Of those, the US and the UK have the highest search volume per person.

The US has a significantly higher cost per click on average.

The following is based on analysis with Ahrefs.

International searches have higher volume overall.

Total search volume
region volume
US 33%
International 67%

Internationally there are more searches with very low volume, while US has more searches with medium volume.

There is not a large difference in the number of searches with very high volume. However, the total volume of these searches is a lot higher internationally.

Searches that have high US volume tend to have high international volume, and vice versa. But there are some exceptions.

A version with the data points binned into hex tiles that show counts:

Searches that have much higher volume internationally
keyword us_volume international_volume
filmoviplex 10 295990
cloroquina 200 5869800
parivahan sewa 10 276990
jokaroom 10 173990
handball em 20 327980

Searches that have much higher volume in the US
keyword us_volume international_volume
football playoff schedule 602000 1000
frontier mail 586000 1000
spectrum mobile 526000 1000
chase bank near me 523000 1000
spectrum internet 998000 2000
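Exceptions like those in the two tables above can be found by comparing the two volumes on a log scale. The sketch below uses four rows from the tables; the threshold of 2 (a 100-fold difference) and the names `log_ratio`, `intl_heavy`, and `us_heavy` are assumptions, since the source does not state how the examples were selected.

```python
import math

# Rows taken from the tables above: (keyword, us_volume, international_volume).
rows = [
    ("cloroquina", 200, 5_869_800),
    ("handball em", 20, 327_980),
    ("spectrum internet", 998_000, 2_000),
    ("frontier mail", 586_000, 1_000),
]

THRESHOLD = 2.0  # hypothetical cutoff: a 100-fold difference in either direction

def log_ratio(us_volume, international_volume):
    """log10 of international over US volume; both volumes must be positive."""
    return math.log10(international_volume / us_volume)

intl_heavy = [kw for kw, us, intl in rows if log_ratio(us, intl) >= THRESHOLD]
us_heavy = [kw for kw, us, intl in rows if log_ratio(us, intl) <= -THRESHOLD]
print(intl_heavy, us_heavy)
```

A symmetric log ratio treats "100 times higher internationally" and "100 times higher in the US" the same way, which a plain difference of volumes would not.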

Searches that have higher volume in the US have higher clicks per search on average than searches that have higher volume internationally.

They also have a higher cost per click on average.

Searches that have higher volume internationally tend to have higher difficulty.